IEEE Access
● Institute of Electrical and Electronics Engineers (IEEE)
All preprints, ranked by how well they match IEEE Access's content profile, based on 31 papers previously published here. The average preprint has a 0.03% match score for this journal, so anything above that is already an above-average fit. Older preprints may already have been published elsewhere.
Sharma, S.; Singh, M.; McDaid, L.; Bhattacharyya, S.
Explainable Artificial Intelligence (XAI) is crucial in healthcare as it helps make intricate machine learning models understandable and clear, especially when working with diverse medical data, enhancing trust, improving diagnostic accuracy, and facilitating better patient outcomes. This paper thoroughly examines the most advanced XAI techniques used in multimodal medical datasets. These strategies include perturbation-based methods, concept-based explanations, and example-based explanations. The value of perturbation-based approaches such as LIME and SHAP in explaining model predictions in medical diagnostics is explored. The paper discusses using concept-based explanations to connect machine learning results with concepts humans can understand. This helps to improve the interpretability of models that handle different types of data, including electronic health records (EHRs), behavioural, omics, sensor, and imaging data. Example-based strategies, such as prototypes and counterfactual explanations, are emphasised for offering intuitive and accessible explanations for healthcare judgments. The paper also explores the difficulties encountered in this field, which include managing high-dimensional data, balancing the tradeoff between accuracy and interpretability, and dealing with limited data by generating synthetic data. Recommendations for future studies focus on improving the practicality and dependability of XAI in clinical settings.
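As an illustration of the perturbation idea behind methods like LIME and SHAP mentioned in the abstract, the sketch below (not from the paper; the toy model, weights, and values are hypothetical) scores each feature by how much the model's prediction changes when that feature is occluded with a baseline value:

```python
# Minimal perturbation-based importance sketch: occlude one feature at a
# time and measure the change in the model's output. This illustrates the
# general idea only, not the LIME or SHAP algorithms themselves.

def perturbation_importance(predict, x, baseline):
    """Score each feature by the prediction change when it is occluded."""
    base_pred = predict(x)
    scores = []
    for i in range(len(x)):
        perturbed = list(x)
        perturbed[i] = baseline[i]          # occlude feature i
        scores.append(abs(predict(perturbed) - base_pred))
    return scores

# Toy "diagnostic" model: a weighted sum of three features.
weights = [0.7, 0.1, 0.2]
predict = lambda v: sum(w * f for w, f in zip(weights, v))

patient = [1.0, 1.0, 1.0]
baseline = [0.0, 0.0, 0.0]
print(perturbation_importance(predict, patient, baseline))
# Feature 0 receives the largest score, mirroring its weight in the model.
```

Real explainers fit a local surrogate (LIME) or average over feature coalitions (SHAP), but both reduce to measuring prediction changes under perturbation as above.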
Delgarmi, M.; Heravi, H.; Rahimpour Jounghani, A.; Shahrezie, A.; Ebrahimi, A.; Shamsi, M.
Studying human postural structure is one of the challenging issues among scholars and physicians. The spine is known as the central axis of the body, and due to various genetic and environmental factors, it can suffer from deformities that cause physical dysfunction and correspondingly reduce people's quality of life. Radiography is the most common method for detecting these deformities and requires monitoring and follow-up until full treatment; however, it repeatedly exposes the patient to X-rays and ionizing radiation, which increases the patient's cancer risk and can be highly dangerous for children or pregnant women. To prevent this, several solutions have been proposed using topographic data analysis of the human back surface. The purpose of this research is to provide an entirely safe and non-invasive method to examine the spinal structure and its deformities. Hence, we attempt to find the exact location of anatomical landmarks on the human back surface, which provides useful and practical information about the status of the human postural structure to the physician. In this study, using a Microsoft Kinect sensor, depth images of the back surfaces of 105 people were recorded, and our proposed approach, a deep convolutional neural network, was used as a model to estimate the location of anatomical landmarks. In the network architecture, two learning processes, covering landmark position and the affinity between two associated landmarks, are performed successively in two separate branches. This is a bottom-up approach; thus, the runtime complexity is considerably reduced. The resulting anatomical points are then evaluated against manual landmarks marked by the operator as the benchmark. Our results showed a PDJ of 86.9% and a PCK of 80%. According to these results, our approach was more effective than other methods trained on thousands of samples.
Chen, J.; Qian, L.; Wang, P.; Sun, C.; Qin, T.; Kalyanasundaram, A.; Zafar, M.; Elefteriades, J.; Sun, W.; Liang, L.
For the machine learning-assisted diagnosis of cardiac diseases, such as thoracic aortic aneurysm, the geometries of the heart and blood vessels need to be reconstructed from medical images, which is usually done by image segmentation followed by meshing. In this study, we applied U-Net (2D and 3D versions), a deep neural network with a U-shaped architecture, to segment the human aorta from CT images. From our experiments, we have the following observations: (1) 2D U-Net, which segments each of the 2D slices of a 3D CT image independently, produced erroneous fragments (e.g., missing parts of the aorta) and boundaries (i.e., aortic walls) in 3D; (2) 3D U-Net, which performs segmentation on 3D regions of a 3D CT image, performed much better than 2D U-Net. We also observed the major weakness of the 3D U-Net: the reconstructed geometries of the aortic wall had large errors (measured by HD95) in some cases. The 3D U-Net in this study serves as a baseline for developing more advanced deep neural network architectures for more accurate geometry reconstruction of the human aorta.
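The HD95 metric used above is the 95th percentile of surface distances between the predicted and reference boundaries. A minimal pure-Python sketch, with hypothetical 2D point sets and a simplified nearest-rank percentile, could look like this:

```python
# Simplified HD95 sketch: for each boundary point in one set, find the
# distance to the nearest point in the other set; take the 95th percentile
# of all such distances (both directions pooled). Production code would use
# dense surface sampling and a proper percentile convention.

def hd95(points_a, points_b):
    """95th-percentile symmetric Hausdorff distance between 2D point sets."""
    def directed(src, dst):
        # distance from each point in src to its nearest point in dst
        return [min(((ax - bx) ** 2 + (ay - by) ** 2) ** 0.5
                    for bx, by in dst)
                for ax, ay in src]

    dists = sorted(directed(points_a, points_b) + directed(points_b, points_a))
    idx = max(0, int(round(0.95 * (len(dists) - 1))))
    return dists[idx]

a = [(0, 0), (1, 0), (2, 0), (3, 0)]
b = [(0, 1), (1, 1), (2, 1), (3, 5)]   # one outlier boundary point
print(hd95(a, b))   # the outlier dominates the 95th-percentile distance
```

Unlike the plain Hausdorff distance (the maximum), HD95 discards the worst 5% of surface distances, which is why it is the usual choice for reporting boundary errors in segmentation.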
Arikan, M.; Sallo, F.; Montesel, A.; Ahmed, H.; Hagag, A.; Book, M.; Faatz, H.; Cicinelli, M.; Meshkinfamfard, S.; Ongun, S.; Dubis, A.; Lilaonitkul, W.
Deep learning for medical applications faces many unique challenges. A major one is the large amount of labelled data required for training while working in a relatively data-scarce environment. Active learning can be used to overcome this data need. A second challenge is poor performance outside of an experimental setting, contrary to the high requirements for safety and robustness. In this paper, we present a novel framework for estimating uncertainty metrics and incorporating a similarity measure to improve active learning strategies. To showcase its effectiveness, a medical image segmentation task was used as an exemplar. In addition to faster learning, robustness was also addressed through adversarial perturbations. Using our framework, we can cut the number of annotations needed by 39% with epistemic uncertainty alone, and by 54% when epistemic uncertainty is combined with a similarity metric.
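The selection step in uncertainty-driven active learning can be sketched as follows. This is a generic illustration, not the paper's framework: it ranks unlabeled samples by the variance of an ensemble of stochastic predictions, a common proxy for epistemic uncertainty, and all numbers are made up:

```python
# Hypothetical acquisition step: samples whose ensemble predictions
# disagree most (highest variance) are queried for annotation first.

def epistemic_rank(ensemble_probs):
    """Rank samples by variance of ensemble foreground probabilities."""
    def variance(ps):
        mean = sum(ps) / len(ps)
        return sum((p - mean) ** 2 for p in ps) / len(ps)

    scores = [variance(ps) for ps in ensemble_probs]
    # highest-variance (most uncertain) samples first
    return sorted(range(len(scores)), key=lambda i: -scores[i])

# Three unlabeled samples, five stochastic forward passes each
# (e.g., Monte Carlo dropout).
probs = [
    [0.9, 0.9, 0.9, 0.9, 0.9],   # confident
    [0.2, 0.8, 0.4, 0.6, 0.5],   # ensemble disagrees -> annotate first
    [0.1, 0.1, 0.2, 0.1, 0.1],   # confident
]
print(epistemic_rank(probs))
```

A similarity measure, as in the paper, would additionally penalize candidates that are near-duplicates of already-selected samples, so the annotation budget covers diverse cases.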
Stringer, C.; Pachitariu, M.
In a recent publication, Ma et al. [1] claim that a transformer-based cellular segmentation method called Mediar [2] -- which won a NeurIPS challenge -- outperforms Cellpose [3] (0.897 vs 0.543 median F1 score). Here we show that this result was obtained by disadvantaging Cellpose in multiple ways. When we removed these impairments, Cellpose outperformed Mediar (0.861 vs 0.826 median F1 score on the updated test set). To further investigate the performance of transformers for cellular segmentation, we replaced the Cellpose backbone with a transformer. The transformer-Cellpose model also did not outperform the standard Cellpose (0.848 median F1 test score). Our results suggest that transformers do not advance the state of the art in cellular segmentation.
Belgaid, A.
This paper presents a deep neural network approach to simulating the pressure of a mechanical ventilator. The traditional mechanical ventilator has a control pressure monitored by a medical practitioner and can behave inaccurately, missing the proper pressure. This paper builds on recent studies and provides a simulator based on a deep sequence model to predict the airway pressure in the respiratory circuit during the inspiratory phase of a breath, given a time series of control parameters and lung attributes. This approach demonstrates the effectiveness of neural network-based controllers in tracking pressure waveforms significantly better than the current industry standard and provides insights for building effective and robust pressure-controlled mechanical ventilators.
Wu, H.-T.; Tolbert, T.; Rapoport, D.
Objective: Cardiogenic oscillations (CO) in airflow signals contain valuable physiological information. However, accurately isolating CO from airflow signals, particularly in individuals with sleep apnea, remains a challenging signal processing problem. Method: We introduce the Optimal Shrinkage-aided Airflow Decomposition Algorithm (OSADA), a novel approach for extracting CO from airflow signals while simultaneously recovering a CO-free, noise-free airflow signal, referred to as diaphragm-driven airflow (DDairflow). The algorithm's performance is quantitatively evaluated using both a semi-real simulated database and real-world data, with benchmark comparisons to existing methods, including the bandpass filter (BPF) and Savitzky-Golay smoothing filter (SGF). Result: For the semi-real database, OSADA significantly outperforms BPF and SGF across multiple performance indices, including the normalized root mean square error (NRMSE) for CO and DDairflow recovery, as well as spectral energy indices of CO. For real-world data, OSADA also achieves superior performance in the data-driven spectral energy index of CO. Conclusion: OSADA is the first algorithm specifically designed for CO recovery from single-channel airflow signals, without relying on additional channels, and is supported by theoretical foundations. Quantitative results suggest robust performance for both CO extraction and DDairflow recovery.
Chen, J.; Wang, W.; Ju, B.; Jiang, J.; Zhang, L.; He, H.; Zhang, X.; Shen, Y.
Accurate pulmonary nodule detection plays an important role in early screening for lung cancer. Although many deep learning-based CAD systems for pulmonary nodule detection have been presented, these methods still have problems in clinical use. Reducing the false-negative rate for tiny nodules, reducing false alarms, and optimizing time consumption are among those that need to be solved as soon as possible. In view of these problems, in this paper we first propose a novel fully convolutional segmentation framework for lung cavity extraction in the preprocessing stage, to address the time consumption problem of existing pulmonary nodule detection systems. Furthermore, a 2D-NestedUNet segmentation network and a 3D-RPN detection network are stacked to achieve high recall and a low false-positive rate on nodule candidate extraction, especially recall of tiny nodules. Finally, a false-positive reduction method based on multi-model ensembling is proposed for the further classification of nodule candidates. Our methods are evaluated on several public datasets, LUNA16, LNDb, and ChestCT2019, demonstrating the superior performance of our CAD system.
Fu, X.; Zhang, S.; Zhou, J.; Ji, Y.
Automated segmentation of mediastinal neoplasms in preoperative computed tomography (CT) scans is critical for accurate diagnosis. Though convolutional neural networks (CNNs) have proven effective in medical imaging analysis, the segmentation of mediastinal neoplasms, which vary greatly in shape, size, and texture, presents a unique challenge due to the inherent local focus of convolution operations. To address this limitation, we propose a confidence-enhanced semi-supervised learning framework for mediastinal neoplasm segmentation. Specifically, we introduce a confidence-enhanced module that improves segmentation accuracy over indistinct tumor boundaries by simultaneously assessing and excluding unreliable predictions, which can greatly enhance the efficiency of exploiting unlabeled data. Additionally, we implement an iterative learning strategy designed to continuously refine the estimates of prediction reliability throughout the training process, ensuring more precise confidence assessments. Quantitative analysis on a real-world dataset demonstrates that our model significantly improves performance by leveraging unlabeled data, surpassing existing semi-supervised segmentation benchmarks. Finally, to promote more efficient academic communication, the analysis code is publicly available at https://github.com/fxiaotong432/CEDS. Author summary: In clinical practice, computed tomography (CT) scans can aid in the detection and evaluation of mediastinal tumors. The early detection of mediastinal tumors plays a crucial role in formulating appropriate treatment plans and improving patient survival rates. To reduce the high cost of manual annotation, researchers have attempted to employ convolutional neural networks (CNNs) for efficient automatic segmentation. However, significant challenges arise from the considerable variation in shape, size, and texture of mediastinal tumors, posing difficulties for the segmentation task.
In this study, we introduce a confidence-enhanced module within a semi-supervised learning framework. By evaluating the model's prediction confidence and selecting high-confidence predictions, we improve the efficiency and quality of data utilization. This approach achieves accurate mediastinal tumor segmentation with only a minimal amount of labeled data. Our research not only provides an effective technical approach for automatic segmentation of mediastinal tumors but also opens up new possibilities for optimizing strategies in semi-supervised learning methods.
Islam, T.; Hussain, M.; Chowdhury, F. U. H.; Islam, B. M. R.
An outbreak of Monkeypox has been reported in 75 countries so far, and it is spreading at a fast pace around the world. The clinical attributes of Monkeypox resemble those of Smallpox, while skin lesions and rashes of Monkeypox often resemble those of other poxes, for example, Chickenpox and Cowpox. These similarities make Monkeypox detection challenging for healthcare professionals examining the visual appearance of lesions and rashes. Additionally, there is a knowledge gap among healthcare professionals due to the rarity of Monkeypox before the current outbreak. Motivated by the success of artificial intelligence (AI) in COVID-19 detection, the scientific community has shown an increasing interest in using AI for Monkeypox detection from digital skin images. However, the lack of Monkeypox skin image data has been the bottleneck. Therefore, in this paper, we used a web-scraping-based dataset of Monkeypox, Chickenpox, Smallpox, Cowpox, Measles, and healthy skin images to study the feasibility of using state-of-the-art AI deep models on skin images for Monkeypox detection. Our study found that deep AI models have great potential in the detection of Monkeypox from digital skin images (precision of 85%). However, achieving more robust detection requires larger training samples to train those deep models.
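The 85% precision figure quoted above measures the fraction of positive (Monkeypox) predictions that are actually correct. A minimal sketch of that computation, with hypothetical prediction and label vectors, is:

```python
# Precision = TP / (TP + FP): of all samples the classifier flagged as
# Monkeypox, how many truly were. The vectors below are made-up examples.

def precision(preds, labels, positive="monkeypox"):
    tp = sum(1 for p, y in zip(preds, labels) if p == positive and y == positive)
    fp = sum(1 for p, y in zip(preds, labels) if p == positive and y != positive)
    return tp / (tp + fp) if tp + fp else 0.0

preds  = ["monkeypox", "monkeypox", "chickenpox", "monkeypox", "monkeypox"]
labels = ["monkeypox", "chickenpox", "chickenpox", "monkeypox", "monkeypox"]
print(precision(preds, labels))   # 0.75: 3 of 4 positive calls correct
```

Precision alone can hide missed cases, which is why detection studies usually pair it with recall when reporting results.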
Rafiee, N.; Gholamipoorfard, R.; Kollmann, M.
Recent progress in computer-aided technologies has had a considerable impact on helping experts reach a reliable and fast diagnosis of abnormal samples. In particular, self-supervised and self-distillation techniques have advanced automated out-of-distribution (OOD) detection in the image domain. Further improvements in OOD detection have been observed by including negative samples derived from shifting transformations of real images. In this work, we study different ways of creating negative samples for medical images and how effective they are when leveraged in a self-supervised self-distillation framework. We investigate the impact of various types of negative examples by applying different shifting transformations to samples derived from in-distribution training data, from an auxiliary dataset, or from a combination of both. For the auxiliary dataset, we compare the OOD detection performance when auxiliary samples are extracted from an in-domain or an out-of-domain source. Our approach uses only data belonging to healthy people during the training procedure and does not require any additional label information. We demonstrate the efficiency of our technique by comparing abnormality detection performance on diverse medical datasets, setting new benchmarks for pneumonia, polyp, and glaucoma detection from X-ray, colonoscopy, and ophthalmology images.
Mahbub, I.; Zim, A. Z.; Imran, T. B.; Mesbah, K. M. F.; Shawon, M. H.; Jobayer, M.
Colorectal cancer (CRC) remains a leading cause of cancer-related mortality worldwide, with early and accurate detection being critical for improving patient outcomes. Automated image segmentation using deep learning has emerged as a transformative tool for identifying colorectal abnormalities in medical imaging. This study conducts a comparative analysis of three prominent deep learning architectures--U-Net, SegNet, and ResNet--for colorectal cancer image segmentation, evaluating their performance on a custom dataset comprising 1,800 images (1,000 polyp images from the Kvasir-SEG dataset and 800 polyp-free images from the WCE Curated Colon Dataset). The dataset was preprocessed to a uniform resolution of 256 x 256 pixels and partitioned into training, validation, and test sets. Quantitative and qualitative results demonstrate that U-Net outperforms SegNet and ResNet, achieving superior segmentation accuracy (validation accuracy of 0.95) and robustness, particularly when trained on datasets that include negative samples. SegNet showed signs of overfitting and delivered unstable results, while ResNet struggled to generalize effectively. The integration of negative images improved specificity by decreasing false-positive rates. Overall, the results establish U-Net as the most effective for precise polyp segmentation, with significant implications for robust diagnostic system development.
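Alongside pixel accuracy, segmentation comparisons like this one are conventionally scored with an overlap measure such as the Dice coefficient. A minimal sketch (the binary masks are hypothetical, not from the study) looks like this:

```python
# Dice = 2*|P intersect T| / (|P| + |T|) on flattened binary masks.
# Unlike pixel accuracy, Dice is insensitive to the large background class,
# which matters when polyps occupy a small fraction of the image.

def dice(pred, truth):
    """Dice similarity between two flat binary masks."""
    inter = sum(p & t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    return 2.0 * inter / total if total else 1.0   # empty masks agree fully

pred  = [0, 1, 1, 1, 0, 0, 1, 0]
truth = [0, 1, 1, 0, 0, 0, 1, 1]
print(dice(pred, truth))   # 0.75: 3 overlapping pixels, 4 predicted, 4 true
```

For a polyp-free (negative) image, a correct prediction is an all-zero mask, which this convention scores as 1.0 rather than leaving the metric undefined.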
Singhal, C.; Gupta, N.; Stein, A.; Zhou, Q.; Chen, L.; Shih, G.
There has been a steady escalation in the impact of Artificial Intelligence (AI) on healthcare, along with an increasing amount of progress being made in this field. While many entities are working on the development of significant deep learning models for the diagnosis of brain-related diseases, identifying the precise images needed for model training and inference tasks is limited by variation in DICOM fields, which use free text to define attributes like series description, sequence, and orientation [1]. Detecting the orientation of brain MR scans (axial/sagittal/coronal) remains a challenge due to these variations, caused by linguistic barriers, human errors, and de-identification, essentially rendering the tags unreliable [2, 3, 4]. In this work, we propose a deep learning model that identifies the orientation of brain MR scans with near-perfect accuracy.
Gavidia, M.; Montanari, A.; Goncalves, J.
Apnea and hypopnea are common sleep disorders characterized by complete or partial obstruction of the airways, respectively. A sleep study, also known as polysomnography (PSG), is typically used to compute the Apnea-Hypopnea Index (AHI), the number of times a person has apnea or certain types of hypopnea per hour of sleep. AHI is then used to diagnose the severity of the sleep disorder. Early detection and treatment of apnea can significantly reduce morbidity and mortality. However, continuous PSG monitoring is unfeasible, as it is costly and uncomfortable for patients. To circumvent these issues, we propose a method, named DRIVEN, to estimate AHI at home from wearable devices and assist physicians in diagnosing the severity of apneas. DRIVEN also detects when apnea, hypopnea, or periods of wakefulness occur throughout the night, facilitating easy inspection by physicians. Patients can wear a single sensor or a combination of sensors that can be easily measured at home: abdominal movement, thoracic movement, or pulse oximetry. For example, using only two sensors, DRIVEN correctly classifies 72.4% of all test patients into one of the four AHI classes, with 99.3% either correctly classified or placed one class away from the true one. This is a reasonable trade-off between the model's performance and the patient's comfort. We use data from three sleep studies from the National Sleep Research Resource (NSRR), the largest public repository, consisting of 14,370 recordings. DRIVEN is based on a combination of deep convolutional neural networks and a light-gradient-boosting machine for classification. Since DRIVEN is simple and computationally efficient, it can be implemented for automatic estimation of AHI in unsupervised long-term home monitoring systems, reducing costs to healthcare systems and improving patient care.
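The AHI definition and the four severity classes referenced above can be sketched directly. The cut-offs below are the standard clinical ones (normal < 5, mild 5-15, moderate 15-30, severe > 30 events per hour); the event counts in the example are hypothetical:

```python
# AHI = (apneas + hypopneas) per hour of sleep, then bucketed into the
# four conventional severity classes.

def ahi(n_apneas, n_hypopneas, sleep_hours):
    """Apnea-Hypopnea Index: respiratory events per hour of sleep."""
    return (n_apneas + n_hypopneas) / sleep_hours

def severity(index):
    if index < 5:
        return "normal"
    if index < 15:
        return "mild"
    if index < 30:
        return "moderate"
    return "severe"

idx = ahi(n_apneas=40, n_hypopneas=56, sleep_hours=6.0)
print(idx, severity(idx))   # 16.0 moderate
```

This is why event detection and total-sleep-time estimation both matter for a home system like DRIVEN: an error in either the numerator or the denominator shifts the patient's severity class.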
Esteban Lansaque, A.; Sanchez Ramos, C.; Borras, A.; Gil Resina, D.
Medical imaging applications are challenging for machine learning and computer vision methods for two main reasons: it is difficult to generate reliable ground truth, and databases are usually too small for training state-of-the-art methods. Virtual images obtained from computer simulations could be used to train classifiers and validate image processing methods if their appearance were comparable (in texture and color) to the actual appearance of intra-operative medical images. Recent works focus on style transfer to generate artistic images by combining the content of one image with the style of another. A main challenge is the generation of pairs with similar content that preserve anatomical features, especially across multi-modal data. This paper presents a deep-learning approach to content-preserving style transfer of intra-operative medical data for realistic virtual endoscopy. We propose a multi-objective optimization strategy for Generative Adversarial Networks (GANs) to obtain content-matching pairs that are blended using a Siamese U-Net architecture (called Content-net) that uses a measure of the content of activations to modulate skip connections. Our approach has been applied to transfer the appearance of intra-operative bronchoscopic videos to virtual bronchoscopies. Experiments assess images in terms of both content and appearance and show that our simulated data can substitute for intra-operative videos in the design and training of image processing methods.
Lahkar, B. K.; Chaumeil, A.; Dumas, R.; Muller, A.; Robert, T.
In human movement analysis, multibody models are an indispensable part of the process for both marker-based and video-based markerless approaches. The constituents of such models (segments, joint constraints, body segment inertial parameters, etc.) and the modeler's choices play an important role in the accuracy of estimated results (segmental and joint kinematics, segmental and whole-body center of mass positions, etc.). For the marker-based method, although standard models exist, particularly for the lower extremity (e.g., the Conventional Gait Model or models embedded in OpenSim), there seems to be a lack of consolidated explanation of the constituents of the whole-body model. For the markerless approach, multibody kinematic models (e.g., the Theia3D model) have come into use lately. However, there is no clear explanation of the estimated quantities (e.g., joint centers, body surface landmarks, etc.) and their relation to the underlying anatomy. This also motivates the need for a description of the markerless multibody model. Moreover, comparing markerless results to those of the classical marker-based method is currently the most common approach for evaluating markerless approaches. This study first aims to develop and describe a whole-body marker-based model ready to be used for human movement analysis. Second, the markerless multibody model embedded in Theia3D is described and its inertial parameters are redefined. We also report an assessment of the markerless approach compared to the marker-based method for a static T-pose performed by 15 subjects. Finally, we disseminate the marker-based and markerless multibody models for use in Visual3D.
Hatamizadeh, A.; Terzopoulos, D.; Myronenko, A.
Fully convolutional neural networks (CNNs) have proven to be effective at representing and classifying textural information, thus transforming image intensity into output class masks that achieve semantic image segmentation. In medical image analysis, however, expert manual segmentation often relies on the boundaries of anatomical structures of interest. We propose boundary aware CNNs for medical image segmentation. Our networks are designed to account for organ boundary information, both by providing a special network edge branch and edge-aware loss terms, and they are trainable end-to-end. We validate their effectiveness on the task of brain tumor segmentation using the BraTS 2018 dataset. Our experiments reveal that our approach yields more accurate segmentation results, which makes it promising for more extensive application to medical image segmentation.
Qiu, L.; Cai, W.; Yu, J.; Zhong, J.; Wang, Y.; Li, W.; Chen, Y.; Wang, L.
The electrocardiogram (ECG) is an effective and non-invasive indicator for the detection and prevention of arrhythmia. ECG signals are susceptible to noise contamination, which can lead to errors in ECG interpretation. Therefore, ECG preprocessing is important for accurate analysis. In this paper, a noise reduction method based on deep learning is proposed. The method is divided into two stages, with a corresponding model for each. In the first stage, a one-dimensional U-Net model is designed for ECG signal denoising to eliminate noise as much as possible. The one-dimensional DR-net model in the second stage is used to reconstruct the ECG signal and to correct the waveform distortion caused by noise removal in the first stage. In this paper, the U-Net and the DR-net are constructed with convolutions to achieve end-to-end mapping from noisy ECG signals to clean ECG signals. The ECG data used in this paper are from CPSC2018, and the noise signals are from the MIT-BIH Noise Stress Test Database (NSTDB). In the experiments, the improvement in signal-to-noise ratio (SNRimp), the decrease in root mean square error (RMSEde), and the correlation coefficient (P) are used to evaluate the performance of the network. This two-stage method is compared with FCN and with U-Net alone. The experimental results show that the two-stage noise reduction method can eliminate complex noise in the ECG signal while retaining its characteristic shape. Based on these results, we believe the proposed method has good prospects for application in clinical practice.
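The denoising metrics named above can be sketched in a few lines. This is a generic illustration of SNR improvement (output SNR minus input SNR, in dB) and RMSE against the clean reference; the toy signals stand in for real ECG traces:

```python
# SNRimp = SNR(denoised) - SNR(noisy), both measured in dB against the
# clean signal; RMSE is the root mean square error of the residual.

import math

def snr_db(clean, estimate):
    """SNR in dB of an estimate relative to the clean reference."""
    sig = sum(c * c for c in clean)
    err = sum((c - e) ** 2 for c, e in zip(clean, estimate))
    return 10 * math.log10(sig / err)

def rmse(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

clean    = [0.0, 1.0, 0.0, -1.0] * 50    # toy periodic trace
noisy    = [c + 0.5 for c in clean]      # heavy additive noise
denoised = [c + 0.1 for c in clean]      # small residual after denoising

snr_imp = snr_db(clean, denoised) - snr_db(clean, noisy)
print(round(snr_imp, 2), round(rmse(denoised, clean), 2))   # 13.98 0.1
```

A positive SNRimp together with a low RMSE indicates that the denoiser removed noise without distorting the waveform, which is exactly the trade-off the two-stage U-Net/DR-net design targets.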
Kondo, S.; Kasai, S.; Hirasawa, K.
In this report, we propose a method for mitosis detection in histopathology images from the MIDOG 2022 challenge dataset. Our method is based on unsupervised domain adaptation at the input level, combining color normalization and object detection. We evaluate our method on the preliminary test set of 20 cases and obtain an F1 score of 0.716.
Ji, D.; Zhao, Y.; Zhang, Z.; Zhao, Q.
In view of the large demand for COVID-19 image recognition samples, recognition accuracy is often not ideal. In this paper, a COVID-19-positive image recognition method based on small-sample recognition is proposed. First, the CT images are preprocessed and converted into the formats required for transfer learning. Second, small-sample image augmentation and expansion are performed on the converted images, such as shear transformation, random rotation, and translation. Then, multiple transfer-learning models are used to extract features, which are subsequently fused. Finally, the model is adjusted by fine-tuning and trained to obtain the experimental results. The experimental results show that our method achieves excellent recognition performance on COVID-19 images, even with only a small number of CT image samples.